Abstract: In this paper we describe an approach to the extension of geographic information systems to take
advantage of the continuing development of capabilities of the Semantic Web. This is presented in the context of a
portal-based Geospatial Information Database (GIDB™), an object-oriented spatial database capable of storing multiple
data types from multiple sources. We have developed our approach for a specific domain, spatially oriented
meteorological and oceanographic data, but it can clearly be applied to other spatial data domains. Finally, we illustrate the
use of an ontology development system based on Generative Sublanguage Ontologies (GSO), a type of linguistic
ontology inspired by the Generative Lexicon Theory, to develop effective domain ontologies.


Introduction 

Traditionally, analyses based on Geographic Information Systems (GIS) have mostly
accessed their own local data store or spatial database. As the Internet has evolved, much
more relevant data has become available and must be taken into account in GIS decision-making.
The further development of the Semantic Web and Web Services technology offers the
capability of effectively and efficiently discovering and accessing data. GIS technology
must be extended to take advantage of these new web-oriented capabilities, such as those
described in the Geography Markup Language (GML) [LA 2004]. In this paper we shall
discuss  an  object-oriented  geographic  data  portal  that  incorporates  Web  Services 
capabilities.  Our  specific  application  context  is  that  of  spatially  oriented  meteorological 
and  oceanographic  (MetOc)  data,  but  the  approach  should  be  applicable  to  any  form  of 
spatial  data.  The  web-based  extension  of  the  system  is  implemented  by  a  specialized  web 
broker utilizing ontologies for MetOc data. Finally, a brief discussion of the potential use
of fuzzy set based ontological representations is given.

Timely provision of spatially based MetOc data is essential in diverse areas such as
emergency planning for severe storms, fishing fleet co-ordination, and most military
operations. For example, the D-Day landings in Normandy were critically affected by
weather, with the massive operation being postponed 24 hours on the basis of meteorological
forecasts. The need for information on weather and sea conditions is just as relevant
today. In order to plan an amphibious beach landing, a Special Operations unit must know
the possible sea state conditions to decide which type of craft it can operate
effectively. Thus there is a need to access appropriate MetOc data and forecasts for an
operational area, and for that data to be shared throughout the planning process.

Data  integration  is  a  pervasive  issue  in  many  areas  such  as  data  warehouses  and 
federated/distributed  databases  [EN  2004].  GIS  access  to  and  retrieval  of  data  from 
heterogeneous  sources  in  a  distributed  system  such  as  the  Internet  also  poses  many 
difficulties.  Assimilation  of  spatio-temporal  data  from  Web-based  sources  means  that 
differences  in  notation,  terminology,  usage,  etc.  prevent  simple  querying  and  retrieval  of 
data.  These  factors  have  been  extensively  explored  before  the  Web  for  the  process  of 
conflation  of  spatial  data  in  which  maps  are  merged  to  yield  higher  quality,  more  accurate 
products  [CH  1998,  RA  2002]. 

The  recognition  of  such  integration  difficulties  has  influenced  many  of  the  concepts 
that  are  embodied  in  the  Semantic  Web.  Ontology  tools  have  been  developed  to  support 
the  goal  of  sharing  knowledge  for  various  domains  of  interest.  Currently  the  development 
of  ontologies  for  geosciences  data  has  been  limited.  This  has  restricted  the  usage  of  the 
full  potential  of  the  Semantic  Web  in  the  area  of  GIS  [RA  2005]. 

Geographic Information Systems and Data Servers

Geographic  information  systems  have  become  a  major  tool  in  a  multitude  of  areas  for 
both  commercial  and  governmental  purposes  worldwide.  A  key  aspect  of  a  GIS  is  the 
underlying  spatial  database  that  supplies  the  volumes  of  various  types  of  data  needed  for 
the  variety  of  applications  that  have  motivated  their  usage.  There  are  two  main  types  of 
spatial  data  in  these  databases,  vector  and  raster.  Vector  geographic  features  use 
geometric  primitives  such  as  points,  lines,  curves,  and  polygons  to  represent  map  features 
such  as  roads,  rivers,  political  boundaries,  etc.  Raster  geographic  data  types  are  generally 
structures  that  consist  of  arrays  of  pixels  with  given  values.  This  can  include  scanned 
maps  and  charts,  and  airborne,  satellite,  and  sonar  imagery  among  others  [SM  2005]. 
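
The distinction between the two data types can be made concrete with a small sketch. The class names, fields, the (longitude, latitude) convention and the sample values below are illustrative assumptions only; they are not the GIDB data model.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees; an assumed convention


@dataclass
class VectorFeature:
    """A map feature described by geometric primitives."""
    name: str
    geometry_type: str          # "point", "line" or "polygon"
    coordinates: List[Point]    # ordered vertices of the primitive


@dataclass
class RasterGrid:
    """A rectangular array of pixel values, stored row by row."""
    name: str
    origin: Point               # coordinates of the upper-left corner of the grid
    cell_size: float            # width and height of one pixel, in degrees
    pixels: List[List[float]]   # pixels[row][column]

    def value_at(self, lon: float, lat: float) -> float:
        """Return the value of the pixel that covers the given location."""
        col = int((lon - self.origin[0]) / self.cell_size)
        row = int((self.origin[1] - lat) / self.cell_size)
        return self.pixels[row][col]


# Hypothetical sample data: a river as a line, and a 2 x 2 temperature grid.
river = VectorFeature("river", "line", [(-90.1, 30.0), (-90.0, 29.9), (-89.9, 29.9)])
sst = RasterGrid("sea_surface_temp", origin=(-90.0, 30.0), cell_size=0.5,
                 pixels=[[20.1, 20.4], [21.0, 21.3]])
```

A vector feature carries its own coordinates, whereas a raster value is located only through the grid's origin and cell size, which is why the two types usually need different query code.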

Although  in  many  applications  the  data  required  is  already  present  in  the  spatial 
database,  it  is  becoming  more  common  that  some  of  the  data  will  be  obtained  from  the 
Internet.  Our  main  concern  here  is  how  spatial  data  can  be  obtained  over  the  web  and  the 
types  of  geographic  data  servers  used  to  access  the  data.  Geographic  data  servers  can  be 
quite  varied.  Some  are  built  on  robust  database  management  systems  (DBMS).  Others  are 
simply  data  transport  mechanisms  for  sensor  data  or  other  observations. 

The most basic types of geographic data servers can be as simple as a web page or
FTP (File Transfer Protocol) site with geographic data files available. For example,
public and private weather services provide imagery and forecasts on their websites in the
form of pre-rendered maps. Another class of servers consists of more comprehensive software
systems that provide a user with a complete, often specialized, map view. These are
usually expensive and advanced server systems, which include a DBMS, a fully functional
geographic information system (GIS), and some type of map renderer. Many of these
systems require users to use a specific client software package to access the server.
Several vendors currently provide these types of software; examples are ESRI's ArcIMS
and Autodesk's MapGuide. Interfaces to these types of servers vary, can be
troublesome to integrate, and typically involve a mixture of open and closed proprietary
protocols. A more general approach using an open-source object-oriented database is
described in the next section.

GIDB™ - An Object-Oriented Database

The  Digital  Mapping,  Charting  and  Geodesy  Analysis  Program  (DMAP)  at  the  Naval 
Research  Laboratory  has  been  actively  involved  in  the  development  of  a  digital 
geospatial mapping and analysis system since 1994 [CO 1998; NE 2001]. The core of
the system is the Geospatial Information Database (GIDB™), an object-oriented spatial
database  capable  of  storing  multiple  data  types  from  multiple  sources. 

The  GIDB  includes  an  object-oriented  data  model,  an  object-oriented  database 
management  system  (OODBMS)  and  various  analysis  tools.  While  the  model  provides 
the  design  of  classes  and  hierarchies,  the  OODBMS  provides  an  effective  means  of 
control  and  management  of  objects  on  disk  such  as  locking,  transaction  control,  etc.  The 
database  component  of  the  system  is  now  implemented  in  an  open  source,  all-Java, 
object-oriented  database  management  system  called  Ozone  [OZ  2003].  Spatial  and 
temporal  analysis  tools  include  query  interaction,  multimedia  support  and  map 
symbology  support.  The  GIDB  offers  3D  terrain  visualizations  with  map  overlay  [LA 
2000].  Users  can  query  the  database  by  area-of-interest,  time-of-interest,  distance  and 
attribute. For example, statistics and data plots can be generated to reflect wave height
for a given span of time at an ocean sensor. Interfaces are implemented to afford
compatibility  with  Arc/Info,  Oracle  8i,  Matlab,  and  others. 
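
A query by area-of-interest and time-of-interest, followed by simple statistics on wave height, might look like the sketch below. The `Observation` record, the `query` function, the bounding-box convention and the sample values are hypothetical and do not reflect the GIDB query interface.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import List, Tuple


@dataclass
class Observation:
    lon: float
    lat: float
    time: datetime
    wave_height_m: float


def query(observations: List[Observation],
          bbox: Tuple[float, float, float, float],
          start: datetime, end: datetime) -> List[Observation]:
    """Select observations inside a bounding box and a time window.

    bbox is (min_lon, min_lat, max_lon, max_lat); all bounds are inclusive.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        o for o in observations
        if min_lon <= o.lon <= max_lon
        and min_lat <= o.lat <= max_lat
        and start <= o.time <= end
    ]


# Hypothetical readings from one ocean sensor, plus one outside the area.
readings = [
    Observation(-89.5, 29.0, datetime(2005, 3, 1, 0), 1.0),
    Observation(-89.5, 29.0, datetime(2005, 3, 1, 6), 1.5),
    Observation(-89.5, 29.0, datetime(2005, 3, 1, 12), 2.0),
    Observation(-89.5, 29.0, datetime(2005, 3, 2, 0), 3.0),   # outside the time window
    Observation(-70.0, 40.0, datetime(2005, 3, 1, 6), 4.0),   # outside the area
]

selected = query(readings, bbox=(-90.0, 28.0, -89.0, 30.0),
                 start=datetime(2005, 3, 1, 0), end=datetime(2005, 3, 1, 23))
heights = [o.wave_height_m for o in selected]
summary = {"count": len(heights), "mean": mean(heights), "max": max(heights)}
```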

An  object-oriented  approach  has  been  beneficial  in  dealing  with  complex  spatial  data, 
and  it  has  also  permitted  integration  of  a  variety  of  raster  and  vector  data  products  in  a 
common  database.  The  raster  data  include  Next  Generation  Radar  (NEXRAD)  and  other 
weather  radar  and  weather  satellite  output,  Compressed  ARC  Digitized  Raster  Graphics 
(CADRG), Controlled Image Base (CIB), JPEG and video. Vector data include Vector
Product Format (VPF) products from the National Geospatial-Intelligence Agency
(NGA), Shape, real-time and in-situ sensor data and Digital Terrain Elevation Data
(DTED).

A communications gateway or portal enables users to obtain data from a variety of
data providers distributed over the Internet in addition to the GIDB, including, for example,
USGS, Digital Earth/NASA, the Geography Network/ESRI and the Fleet Numerical
Meteorology and Oceanography Center (FNMOC). This portal establishes a well-defined
interface that brings together such heterogeneous data for a common geo-referenced
presentation to the user [WI 2003]. Differences in data formats are resolved
to a uniform format and all data is re-projected to a uniform map projection. An
illustration of the interface for a typical data request is shown in Figure 1.

Figure 1. GIDB Interface


Web  Services 

In this section we give an overview of the Web Services technology needed for the
description of our web-enhanced GIS system. Web Services provide data and services to
users and applications over the Internet. The most commonly used Web Services
standards and protocols include, but are not necessarily limited to, the Extensible Markup
Language (XML), the Simple Object Access Protocol (SOAP), the Web Services Description
Language (WSDL) and Universal Description, Discovery and Integration (UDDI) [DI
2000].

XML is a language used to define data in a platform and programming language
independent manner. XML has become one of the most widely used standards for interoperable
exchange of data on the Internet, but it does not define the semantics of the data it describes.
Instead, the semantics of an XML document are defined by the applications that process
it.

XML  Schemas  define  the  structure  or  building  blocks  of  an  XML  document.  Some 
of  these  structures  include  the  elements  and  attributes,  the  hierarchy  and  number  of 
occurrences  of  elements,  and  data  types,  among  others. 

WSDL  allows  the  creation  of  XML  documents  that  define  the  “contract”  for  a  Web 
Service.  The  “contract”  details  the  acceptable  requests  that  will  be  honored  by  the  Web 
Service  and  the  types  of  responses  that  will  be  generated  [CE  2002].  The  “contract”  also 
defines  the  XML  messaging  mechanism  of  the  service.  The  messaging  mechanism,  for 
example,  may  be  specified  as  SOAP. 

A  UDDI  registry  provides  a  way  for  data  providers  to  advertise  their  Web  Services 
and  for  consumers  to  find  data  providers  and  desired  services.  Data  provided  about  a 
Web  Service  can  be  categorized  much  like  information  in  a  telephone  book  into  “white” 
pages,  “yellow”  pages  and,  unlike  a  telephone  book,  the  “green”  pages.  The  white  pages 
include  basic  provider  information  such  as  name,  address,  business  description  and 
contact  information.  The  yellow  pages  provide  services  listed  by  category  as  determined 
by the North American Industry Classification System and the Standard Industrial
Classification. The white and yellow pages include enough information for a consumer
to determine whether they need the technical specification for the service, which is
contained in the green pages. The green pages may either contain or point to the WSDL
file. An interface to a UDDI registry may allow users to search for Web Services by
business category, business name or service.

It  is,  of  course,  not  necessary  to  register  a  Web  Service  with  a  UDDI  registry. 
However,  that  would  be  similar  to  a  business  not  listing  its  telephone  number  in  a 
telephone  directory.  Not  having  a  listing  would  make  it  more  difficult  for  consumers  to 
discover  and  utilize  a  Web  Service. 

A graphic representation of the Web Services protocol stack as described above is
shown in Figure 2 [CE 2002]. A Web Service describes its interface with a WSDL file
and  may  be  registered  in  a  UDDI  registry.  Interfaces  defined  in  XML  often  identify 
SOAP  as  the  required  XML  messaging  protocol.  SOAP  allows  for  the  exchange  of 
information  between  computers  regardless  of  platform  or  language. 

A  sample  use  of  the  protocol  stack  is  illustrated  in  Figure  3.  The  Web  Service 
publishes  its  existence  with  one  or  more  UDDI  registries.  Next,  a  user  discovers  the 
service  from  a  UDDI  registry  and  retrieves  a  description  of  the  service.  The  user  then 
either  automatically  invokes  the  service  or  writes  an  application  that  invokes  the  service 
by  sending  an  XML  message  over  the  specified  transport  to  the  service.  The  Web 
Service  then  returns  an  XML  message  over  the  specified  transport. 
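
The publish, discover and invoke sequence can be sketched with an in-memory stand-in for the registry. Everything here is a simplifying assumption: the `Registry` class is not a UDDI client, the description dictionary is not WSDL, and the service is a local function rather than a SOAP endpoint reached over HTTP.

```python
from typing import Dict, List


class Registry:
    """In-memory stand-in for a UDDI registry (not a real UDDI client)."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def publish(self, name: str, category: str, description: dict) -> None:
        """Record a service under a category together with its description."""
        self._entries[name] = {"category": category, "description": description}

    def find_by_category(self, category: str) -> List[str]:
        """Return the names of all services registered under a category."""
        return sorted(n for n, e in self._entries.items() if e["category"] == category)

    def describe(self, name: str) -> dict:
        """Return the interface description that the service published."""
        return self._entries[name]["description"]


def salinity_service(request_xml: str) -> str:
    """Hypothetical service: takes an XML request and returns an XML reply."""
    return "<response><salinity unit='psu'>35.1</salinity></response>"


registry = Registry()

# Step 1: the service publishes its existence and its interface description.
registry.publish(
    "WS1",
    category="oceanography",
    description={"operation": "getdata", "transport": "HTTP",
                 "endpoint": salinity_service},
)

# Step 2: a user discovers the service and retrieves its description.
found = registry.find_by_category("oceanography")
description = registry.describe(found[0])

# Step 3: the user invokes the service by sending an XML message,
# and the service answers with an XML message.
reply = description["endpoint"]("<request><getdata>sal</getdata></request>")
```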


Web Services Discovery - UDDI
Web Services Description - WSDL
XML Messaging Protocol - SOAP
Transport Protocol - HTTP

Figure 2. Web Services Protocol Stack.

There are applications that provide services on the Web without using all components
of the Web Services protocol stack described above. These Web-based services employ
diverse methods for discovery, description, messaging and transport. Within these
Web-based services, adherence to standards and protocols varies.


Figure 3. Illustrated Use of Web Services


Web  Services  for  MetOc  Data 

Our current concentration in net-centric operations is focused on improving delivery
of MetOc data in order to achieve information superiority for tactical operations
planning. Some specific architectures using Web Services and Web-based services for
such data are described in this section.

The Navy Enterprise Portal (NEP) is a Web Service access portal. The NEP provides
Web-browser based user interfaces, or user-facing services, which interact with
data-oriented services on remote servers. A data-oriented service is not tightly coupled to any
client application. The NEP allows the user to simultaneously access multiple user-facing
services from the same Web-browser interface [NA 2004].

Even with the advent of Web Services and Web-based services, human resources are
still required to integrate these data sources into applications. Compatibility of XML
schema versions is an inherent issue, and Web Services based on common XML schemas
may be implemented in ways that produce inconsistent results.

GIDB,  for  example,  does  not  automatically  discover  new  Web  Services  or  Web-based 
data  services.  A  human  in  the  loop  is  necessary  to  find  relevant  data  on  the  Internet  and 
write  application  code  to  connect  the  GIDB  Portal  System  to  the  data  source.  The  GIDB 
currently  connects  to  over  600  servers  offering  over  2,500  services.  The  fact  that  some  of 
the  code  used  to  connect  to  these  servers  is  common  to  multiple  servers  helps  with  code 
development  and  maintenance. 

While  GIDB  establishes  a  single  portal  to  multiple  servers,  there  are  efforts  to 
establish  a  uniform  Web  Service  within  various  communities  of  interest  that  can  be 
separately  implemented  by  multiple  data  providers.  These  sorts  of  efforts  seek  to 
accomplish  this  through  adoption  of  a  specified  XML  Schema  and  WSDL.  Our 
experience has been, however, that separate implementation of the Web Service by different
data providers tends to produce varying implementations that may impact
interoperability. In these cases, client-side code that conforms to the particular
implementation must be developed. Based on our experience, service providers can
choose to implement as much or as little of the specified Schema as they wish. An XML
Schema, for example, may allow users to request data that has been modified since a
specified date and time. Because of the variations in implementation of these Web
Services, one service provider supports data responses to this request, while another
service provider returns an error message. Although both providers produce gridded
numerical forecast model output on a scheduled timetable, the provider producing the
error message does not believe that any users would request its data in that manner.
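
The kind of client-side code this situation calls for can be sketched as follows. The two providers, the `UnsupportedRequest` exception and the grid records are hypothetical; the sketch only shows a client that tolerates a provider which rejects the optional modified-since request by falling back to filtering on its own side.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Record = Tuple[str, datetime]  # (grid identifier, time the grid was last modified)

# Hypothetical model output that both providers publish.
GRIDS: List[Record] = [
    ("grid_00z", datetime(2005, 3, 1, 0)),
    ("grid_12z", datetime(2005, 3, 1, 12)),
]


class UnsupportedRequest(Exception):
    """Raised by a provider that does not implement an optional request."""


def provider_a(modified_since: Optional[datetime] = None) -> List[Record]:
    """Hypothetical provider that implements the modified-since filter."""
    if modified_since is None:
        return list(GRIDS)
    return [r for r in GRIDS if r[1] > modified_since]


def provider_b(modified_since: Optional[datetime] = None) -> List[Record]:
    """Hypothetical provider that rejects the modified-since filter."""
    if modified_since is not None:
        raise UnsupportedRequest("modified-since is not supported")
    return list(GRIDS)


def fetch_modified_since(provider, since: datetime) -> List[Record]:
    """Ask for recent data; on rejection, fetch everything and filter locally."""
    try:
        return provider(modified_since=since)
    except UnsupportedRequest:
        return [r for r in provider() if r[1] > since]


cutoff = datetime(2005, 3, 1, 6)
result_a = fetch_modified_since(provider_a, cutoff)
result_b = fetch_modified_since(provider_b, cutoff)
```

Both calls yield the same records, but the fallback transfers the full data set from the second provider, which is the cost of an incomplete implementation of the shared Schema.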

A need exists for resolving semantic and business rule differences that result from
specific implementations. While, as described above, efforts toward a unified
domain-specific Web Service may define a syntax that allows standardization of terms used to
request MetOc information and respond to such requests, the semantics are not tightly
defined. The Web Service implementers are each free to implement a different subset
of the specified Schema, and each may interpret various elements and attributes in
incompatible ways. Uniform conventions may, of course, reduce this ambiguity.

MetOc  Broker 

With  Web  Service  technology  playing  an  ever-increasing  role  in  net-centric  operations 
and  new  web  services  becoming  available,  the  need  exists  for  applications  to  quickly  and 
easily integrate with these web services. As we have discussed, web services technology
has freed developers from platform and programming language constraints, but it has not
yet freed developers from writing code that connects to server applications. Web Service
technology merely defines the specifications (WSDL and XML Schemas) to which the
client application developer must conform. These schemas may be complex and, in
addition, structural and semantic differences may exist between web services.

Since  web  services  give  the  promise  of  discoverable,  self-describing  services  that 
conform  to  common  standards,  their  use  should  allow  the  possibility  of  an  efficient  and 
automated  capability  to  obtain  and  integrate  data  [CE  2002].  Ideally,  with  this  automated 
capability it should be possible to obtain and integrate data (1) from alternate sources
when data becomes unavailable from a previously reliable source, (2) from newly
identified data sources that possibly employ previously unseen schemas, or (3) from a
known source that changes its interface definition.

Our  approach  to  these  problems  was  the  development  of  an  Advanced  MetOc  Broker 
(AMB),  which  supports  the  automated  identification,  retrieval,  and  fusion  of  MetOc  data 
from  new  and  ad  hoc  web  services.  Our  approach  to  automating  the  AMB’s  recognition 
of  terms  used  by  new  web  services  for  data  requests  and  responses  is  to  apply  MetOc 
ontologies  to  meteorological  and  oceanographic  forms  of  data  [FD  1999;  FO  2002;  AL 
2003].  Since  the  MetOc  domain  is  well  understood,  this  process  can  overcome  many 
semantic  limitations  inherent  to  MetOc  web  services.  The  AMB  uses  a  mapping  function 
to resolve semantic differences and integrate data. The descriptions of concepts and terms
inherent in MetOc ontologies allow the AMB to reconcile different schemas that may have
varying semantics but describe similar data requests and responses.

Figure 4 shows a high level conceptual description of the mapping
process in which ontology usage may enable an automated mapping process. A data
provider uses the term “temp” and the AMB uses the term “air_temperature”. These
need to be mapped as equivalent. This is shown in the mapping with source term “temp”
mapping to target term “air_temperature”. The CONCEPT_AIR entry in the ontology
(mapping dictionary), which lists both terms, is used to resolve this mapping. Therefore
MetOc Web Services using domain-relevant
terminology are discoverable by the AMB, and the AMB can resolve requests to and
responses from these new web services.
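
A minimal sketch of concept-based term mapping follows. It assumes the ontology can be reduced to a dictionary from concept names to the terms that denote them. The `ONTOLOGY` contents beyond the two terms listed under CONCEPT_AIR, and the function names, are illustrative assumptions rather than the AMB implementation.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical mapping dictionary: concept name -> terms known to denote it.
ONTOLOGY: Dict[str, Set[str]] = {
    "CONCEPT_AIR": {"temp", "air_temperature"},
    "SALINITY": {"sal", "salinity"},
}


def normalize(term: str) -> str:
    """Ignore case and surrounding whitespace when comparing terms."""
    return term.strip().lower()


def concept_of(term: str) -> Optional[str]:
    """Return the concept that a term denotes, or None if it is unknown."""
    key = normalize(term)
    for concept, terms in ONTOLOGY.items():
        if key in terms:
            return concept
    return None


def map_term(source_term: str, target_vocabulary: Iterable[str]) -> Optional[str]:
    """Map a provider term to the requester term that shares its concept."""
    concept = concept_of(source_term)
    if concept is None:
        return None
    for candidate in sorted(target_vocabulary):
        if normalize(candidate) in ONTOLOGY[concept]:
            return candidate
    return None


# The requester (here the AMB) is assumed to use these two terms.
amb_vocabulary = ["air_temperature", "salinity"]
mapping = {"SOURCE": "temp", "TARGET": map_term("temp", amb_vocabulary)}
```

Because the match goes through the shared concept rather than through string similarity, “temp” and “air_temperature” are paired even though the strings have little in common.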

Although  our  focus  is  on  the  MetOc  domain,  the  methodology  employed  by  the 
AMB  is  extendable  to  other  spatial  domains.  Systems  based  on  this  approach  would  not 
require extensive client application development for each new web service from which
data can be retrieved. Similarly, extensive client application maintenance would not be
required each time the schema of an existing web service is altered.


Provider and Requester Pair

  <Provider Description>
    <temp>
      <.../>
      <.../>
      <DEG-FARENHEIT/>
    </temp>
  </Provider>

  <Requester Description>
    <air_temperature>
      <.../>
      <TEMP UNIT=“F”/>
      <.../>
    </air_temperature>
  </Requester>

  CONCEPT_AIR {
    terms: <temp, air_temperature>
    attribute: <CONCEPT_SPEED>
    attribute: <CONCEPT_DIRECT>
    attribute: <CONCEPT_TEMPERATURE>
    ...}

Mapper

  • Term matching based on higher level concepts

Mapping

  <Provider_Requester_Mapping>
    <MAPPINGS>
      <MAP>
        <SOURCE> temp </SOURCE>
        <TARGET> air_temperature </TARGET>
      </MAP>
    </MAPPINGS>
  </Provider_Requester_Mapping>


Figure 4: Conceptual View of Ontology Mapping Process


Ontology  Development  for  MetOc  Data 

The  development  of  the  ontology  structure  for  the  AMB  has  involved  the  elicitation  of 
concepts,  terms,  etc.  from  multiple  sources.  An  example  in  Figure  5  focuses  on 
oceanographic  data  that  has  been  the  basis  of  our  initial  development  due  to  the 
availability  of  resources  and  experts  and  its  somewhat  simpler  structure.  We  have  used 
access  to  resident  oceanographic  data  experts  at  the  Naval  Research  Laboratory  to 
provide an initial organization of oceanographic concepts. Additionally, since we obtain
data from various web sources whose terminology must be reflected in the ontologies, we
have included in the structure in Figure 5 descriptions of the sources and models that produce
some of the data.


Figure  5.  Sample  Concepts/Terms  and  their  Relationships 


In Figure 6 we show the relationship between the terms in the ontology index on the
left of the figure and the web services that may be accessed for the terms. The MetOc
broker uses these terms in its queries as follows. The terms from the input request are used to
create a query (e.g., getdata, sal, etc.), and the query term (e.g., sal or salinity) is used
to retrieve concepts using the term concept index. The concept then serves as the key
to the web service method index, from which the appropriate web services and their methods
are retrieved. If more than one web service and method are retrieved, one must be chosen; we are
evaluating selection/filter algorithms to rank the candidates (e.g., by confidence) and select among them.
The ranking will reflect measures of confidence in the data's availability, reliability and
suitability. This may include confidence parameters that reflect the data source's current
availability, the status of the source (e.g., government, military, educational, foreign, etc.)
and the timeliness of the data. Once the candidate web services and their methods are
retrieved, the input request will need to be translated to the request format of the identified
web service.
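
The two-step lookup and a possible ranking can be sketched as below. The index entries for SALINITY follow Figure 6. The confidence values, the weights and the weighted-sum rule are purely illustrative assumptions, since the selection/filter algorithms are still under evaluation.

```python
from typing import Dict, List, Tuple

# Term concept index: term -> concept.
TERM_CONCEPT_INDEX: Dict[str, str] = {"sal": "SALINITY", "salinity": "SALINITY"}

# Web service method index: concept -> (web service, method) pairs.
WEB_SERVICE_METHOD_INDEX: Dict[str, List[Tuple[str, str]]] = {
    "SALINITY": [("WS1", "M1"), ("WS55", "M3"), ("WS21", "M2")],
}

# Hypothetical confidence parameters per web service, each in [0, 1].
CONFIDENCE: Dict[str, Dict[str, float]] = {
    "WS1": {"availability": 0.9, "source_status": 0.8, "timeliness": 0.5},
    "WS55": {"availability": 0.7, "source_status": 1.0, "timeliness": 0.9},
    "WS21": {"availability": 0.3, "source_status": 0.5, "timeliness": 0.4},
}

# Hypothetical weights for combining the parameters into one score.
WEIGHTS: Dict[str, float] = {
    "availability": 0.5, "source_status": 0.25, "timeliness": 0.25,
}


def confidence(service: str) -> float:
    """Weighted sum of the confidence parameters of one web service."""
    scores = CONFIDENCE[service]
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def candidates(term: str) -> List[Tuple[str, str]]:
    """Look up term -> concept -> services, ranked by descending confidence."""
    concept = TERM_CONCEPT_INDEX.get(term.strip().lower())
    if concept is None:
        return []
    pairs = WEB_SERVICE_METHOD_INDEX.get(concept, [])
    return sorted(pairs, key=lambda pair: confidence(pair[0]), reverse=True)


ranked = candidates("sal")
```

With these assumed values WS55 ranks first, ahead of WS1 and WS21. A different weighting could change the order, which is why the choice of confidence measure matters.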


Ontology

Term Concept Index

  Term/key      Concept
  sal           SALINITY

Web Service Method Index

  Concept/key   Web Service   Method
  SALINITY      WS1           M1
                WS55          M3
                WS21          M2
  CRIT DEP..    WS3           M1
                WS2           M3
                WS4           M4

Figure 6. Ontology and Web Services Indexes


GSO  System 

The  ontology  development  system  we  used  for  the  AMB  is  based  on  Generative 
Sublanguage  Ontologies  (GSO),  a  type  of  linguistic  ontology  inspired  by  the  Generative 
Lexicon  Theory  [GA  2003;  GA  2005].  These  approaches  provide  a  compact  conceptual 
representation  of  related  word  meanings  that  can  be  used  to  robustly  and  accurately 
interpret  natural  language  sentences.  They  also  provide  generative  operators  that  can  be 
used to select the correct meaning of a word from the possible alternatives for a given
context. Their robustness arises from the ability to use these operators in situations
where words are used in creative and unanticipated ways.

GSO is one of the first implementations of the Generative Lexicon Theory and has
the following architecture, implemented in Java (see Figure 7).

Figure 7. GSO system architecture

The GSO Editor is a
graphical user interface that can be used to add, edit, and modify the ontology for a
selected application domain such as the AMB. It provides a user-friendly drag and drop interface and
performs knowledge integrity checks during editing. For example, it prevents the user
from specifying cyclic inheritance relations and prevents the user from deleting concepts
that are used in the representation of others. It thus relieves the user of performing several
representation consistency and completeness checks. These checks are performed by the
GSO Engine among its other functions. Host applications such as the AMB access the
application-specific ontology via the GSO Engine interface, which is capable of
responding to various queries from the host. For example, the following query
could be issued by the AMB: “Get all concepts pertaining to the term salinity”. The GSO
Engine would return the corresponding GSO concept SALINITY, which states that it is a
property of water and in particular seawater. In addition, the GSO Engine provides
various functions to compute synonymy and similarity across concepts that
can be used for partial mapping. The ontology comprises two main components: the
terms and the concepts that they point to. The concepts are represented using the GSO
representation approach, which is a first order predicate calculus representation
embedded in an object-oriented framework [GA 2003].
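
One of the integrity checks mentioned above, rejecting cyclic inheritance, can be sketched as a reachability test over the type_of relation. This is only an illustration of the idea, written in Python for brevity. The function name and the dictionary representation are assumptions and say nothing about how the Java-based GSO Engine implements the check.

```python
from typing import Dict, List


def would_create_cycle(parents: Dict[str, List[str]], child: str, new_parent: str) -> bool:
    """Return True if making new_parent a parent of child would create a cycle.

    A cycle arises exactly when child is already an ancestor of new_parent,
    or when the two are the same concept.
    """
    stack = [new_parent]
    seen = set()
    while stack:
        concept = stack.pop()
        if concept == child:
            return True
        if concept in seen:
            continue
        seen.add(concept)
        # Continue the search through the ancestors of this concept.
        stack.extend(parents.get(concept, []))
    return False


# type_of relations taken from the concept diagrams: concept -> its parents.
type_of: Dict[str, List[str]] = {
    "OCEAN": ["METOC"],
    "SALINITY": ["PROPERTY", "PARAMETER"],
    "DEPTH": ["PROPERTY", "PARAMETER"],
}

# METOC is already an ancestor of OCEAN, so this edit must be rejected.
rejected = would_create_cycle(type_of, child="METOC", new_parent="OCEAN")
# SURFACE has no descendants yet, so giving it the parent ENTITY is safe.
allowed = not would_create_cycle(type_of, child="SURFACE", new_parent="ENTITY")
```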

This is illustrated by the oceanographic data design components in Figures 8 and 9.
The high level concept Ocean is shown first, followed by the subconcepts Surface and
Subsurface. Next, two properties, Salinity and Depth, are illustrated.


OCEAN  id: 1
  type_of: METOC
  terms: ocean(n)
  properties:
    ! METOC_PROPERTY(this, value)

(a) Ocean Structure


SURFACE  id: 12
  type_of: ENTITY
  terms: surface(n)
  gloss: Surface of an object
  constituent:
    PART_WHOLE(~whole: OCEAN, ~part: this)

(b) Surface Structure


SUBSURFACE  id: 13
  type_of: ENTITY
  terms: subsurface(n)
  gloss: An entity that is part of the ocean and located below the surface at a
         distance x meters, x > 0 and x < ?
  constituent:
    PART_WHOLE(~whole: OCEAN, ~part: this)
    LOCATED(~entity: this, ~refEntity: SURFACE, ~direction: BELOW, ~distance: x meters)

(c) Subsurface Structure

Figure 8. Ontology Diagrams

Concepts are shown in a rounded box with the name of the concept at the head of
the box. Slot names are italicized and are in lower case. Two reserved symbols are
“this”, referring to the concept itself, and a GSO symbol showing that a slot is
inherited from one of the ancestors. Arguments are referred to by aliases, indicated
by a tilde, much like variable names in an object. Terms are lower case and non-italicized,
comprising one or more words and/or symbols. Terms have an associated part of
speech, such as (v) indicating a verb, (n) a noun, (a) an adjective, etc.


SALINITY
  type_of: PROPERTY, PARAMETER
  terms: salinity(n), sal(n)
  gloss: Salinity of sea water

    Argument Alias    Argument Type
    ~object           SEAWATER
    ~level            MEASURED_VALUE

  Behavior: !METHOD(~returnedObject: ?, this)

(a) Salinity Structure


DEPTH  id: 30
  type_of: PROPERTY, PARAMETER
  terms: depth(a), depth(n)
  gloss: Depth at which the observations are made
  Argument Structure:

    Argument Alias    Argument Type
    ~entity           ENTITY
    ~entity'sTop      TOP(~entity)
    ~depthLoc         ENTITY or BOTTOM(~entity)
    ~deep             VALUE/NUMBER

  Behaviour(s): LOCATED(~object: ~depthLoc, ~refObject: ~entity'sTop,
                        ~direction: BELOW, ~distance: ~deep)
                !METHOD(~returnedObject: ?, this)

(b) Depth Structure

Figure 9. Ontology Diagrams


Fuzzy  Ontology  Extensions 

In  the  ontology  approach  we  have  described  above,  matching  of  differing  terms  is 
based  on  syntactic  variations  and/or  the  relationships  implicit  in  the  ontology’s 
conceptual structure. However, a valuable extension would be the capability
for approximate matching that captures a degree of matching. Such
degrees could be used to provide weighted overall matches on discovery of
data sources. Related issues have previously arisen in the hierarchical structures that are
used in both fuzzy object-oriented databases [GE 1997] and in the conceptual hierarchies
for association rule [CH 2000, LA 2003] and attribute-oriented generalization data
mining approaches [AP 2003]. In these, the relationships among terms can differ in the
database querying aspect or in the data mining applications in which terms in the data
warehouse are not exact matches to the given hierarchical structure. In such cases the use
of measures such as fuzzy similarity or proximity relationships among the terms has
proven fruitful.
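
Approximate matching with a degree of match can be sketched with a fuzzy similarity relation over terms. The term pairs, the degrees and the threshold below are invented for illustration; in practice they would come from expert elicitation or corpus analysis.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical fuzzy similarity relation: unordered term pair -> degree in [0, 1].
SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset({"salinity", "sal"}): 1.0,
    frozenset({"salinity", "conductivity"}): 0.6,
    frozenset({"temp", "air_temperature"}): 0.9,
    frozenset({"temp", "sea_surface_temperature"}): 0.7,
}


def similarity(a: str, b: str) -> float:
    """Symmetric and reflexive degree of similarity; 0.0 for unrelated terms."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset({a, b}), 0.0)


def fuzzy_matches(term: str, candidates: Iterable[str],
                  threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return candidates whose similarity to term reaches the threshold,
    best match first (ties broken alphabetically)."""
    scored = [(c, similarity(term, c)) for c in candidates]
    kept = [(c, degree) for c, degree in scored if degree >= threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


matches = fuzzy_matches(
    "temp", ["air_temperature", "sea_surface_temperature", "salinity"]
)
```

Unlike the exact concept lookup, every match here carries a degree, so candidate data sources could be weighted by it rather than simply accepted or rejected.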

Currently  there  are  specific  efforts  to  apply  fuzzy  ontologies  to  web  searching  in  the 
context  of  document  retrieval  [WY  2001,  PA  2004].  Such  ontologies  are  typically  based 
on  a  corpus  of  documents,  abstracts  or  citations.  This  corpus  is  then  analyzed  to  generate 
the  fuzzy  ontology  based  on  analyses  of  frequencies  of  term  occurrences/co-occurrences. 

In the environment of Web Services a similar approach can be taken by exploring
UDDI registries for appropriate Web Services and basing a term analysis on these as
described above. However, since we are also often focused on a specific domain, as in
our application for MetOc data, it is to be expected that part of
the ontology must also be based on this specific domain’s structure. Typically elicitation from
experts/expert sources is utilized for this, and we expect that issues of term similarity
arising from such multiple sources can be captured in a fuzzy ontology structure.
Finally, various domain ontologies for many specific areas are rapidly being developed
around the world. We believe that making use of such pre-existing ontologies will require
their merging/intersection. This merging would also be facilitated by a fuzzy ontology
in order to support term differences that occur across the various ontologies. Indeed,
current research indicates that this is a feasible goal [TA 2005].

Conclusions

This paper has illustrated an approach to the extension of geographic information
systems to take advantage of the continuing development of capabilities of the Semantic
Web. We have shown this in the context of a specific domain, MetOc data, but clearly
this approach can be applied to other spatial data domains. Most important for this
extension is the ability to develop effective domain ontologies. Extensive work is under
way in all application areas to develop broadly encompassing ontologies, including that of
geographic data [KU 2001, KL 2004, PK 2004]. It is clear that it is extremely important
to extend the capabilities of GIS to take advantage of the Semantic Web, and our
approach illustrates one such possible extension.